Code component debugging in an application program

ABSTRACT

Disclosed aspects relate to debugging a set of code components of an application program. A set of defect data which indicates a set of defects may be collected with respect to an application program. The set of defect data may be derived from a set of post-compilation users of the application program. A set of test case data which indicates a set of user interface features of the application program may be collected with respect to the application program. The set of test case data may be derived from a set of development tests of the application program. Using both the set of defect data and the set of test case data, a set of fragility data for the set of code components of the application program may be determined. Based on the set of fragility data, the set of code components of the application program may be debugged.

BACKGROUND

This disclosure relates generally to computer systems and, more particularly, relates to debugging a set of code components in an application program. Application programs may be used to carry out a variety of functions. The complexity of code components in application programs is increasing. As the complexity of code components in application programs increases, the need for debugging of code components may also increase.

SUMMARY

Aspects of the disclosure relate to debugging a set of code components of an application program. The degree of fragility of code components may be determined based on defect data included in application-store reviews. Application-store reviews may be analyzed to mine information that indicates defects in an application program. Based on the defect data extracted from the application-store reviews, a group of development tests may be identified. Using the group of development tests, defects mentioned in the application-store reviews may be mapped with one or more code components of an application program. A fragility score for the identified code components may be calculated based on the correlation between the defects, the correlation between the development tests, the defect criticality, cluster density, code complexity/inter-dependency, or other factors. An analytics-based component may periodically sync with the application-store review updates to compute updated fragility scores for the code components.

Disclosed aspects relate to debugging a set of code components of an application program. A set of defect data which indicates a set of defects may be collected with respect to an application program. The set of defect data may be derived from a set of post-compilation users of the application program. A set of test case data which indicates a set of user interface features of the application program may be collected with respect to the application program. The set of test case data may be derived from a set of development tests of the application program. Using both the set of defect data and the set of test case data, a set of fragility data for the set of code components of the application program may be determined. Based on the set of fragility data, the set of code components of the application program may be debugged.

The above summary is not intended to describe each illustrated embodiment or every implementation of the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawings included in the present application are incorporated into, and form part of, the specification. They illustrate embodiments of the present disclosure and, along with the description, serve to explain the principles of the disclosure. The drawings are only illustrative of certain embodiments and do not limit the disclosure.

FIG. 1 depicts a high-level block diagram of a computer system for implementing various embodiments of the present disclosure, according to embodiments.

FIG. 2 is a flowchart illustrating a method for debugging a set of code components of an application program, according to embodiments.

FIG. 3 is a flowchart illustrating a method for debugging a set of code components of an application program, according to embodiments.

FIG. 4 is a flowchart illustrating a method for debugging a set of code components of an application program, according to embodiments.

FIG. 5 is a diagram illustrating a correlation between a set of defect data and a set of test case data, according to embodiments.

FIG. 6 is a diagram illustrating a correlation between a set of defect data and a set of code components using a set of test case data, according to embodiments.

FIG. 7 is a flowchart illustrating a method for debugging a set of code components of an application program, according to embodiments.

FIG. 8 is a diagram illustrating a defect-code component distribution, according to embodiments.

FIG. 9 is a flowchart illustrating a method for debugging a set of code components of an application program, according to embodiments.

While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

DETAILED DESCRIPTION

Aspects of the disclosure relate to debugging a set of code components of an application program. The degree of fragility of code components may be determined based on defect data included in application-store reviews. Application-store reviews may be analyzed to mine information that indicates defects in an application program. Based on the defect data extracted from the application-store reviews, a group of development tests (e.g., test cases) may be identified. Using the group of development tests, defects mentioned in the application-store reviews may be mapped with one or more code components of an application program. A fragility score for the identified code components may be calculated based on the correlation between the defects, the correlation between the development tests, the defect criticality, cluster density, code complexity/inter-dependency, or other factors. An analytics-based component may periodically sync with the application-store review updates to compute updated fragility scores for the code components. Leveraging application-store reviews for determining the degree of fragility of code components may be associated with defect identification, code debugging efficiency, and application program reliability.

Application stores are one location where users may share data and information regarding defects, bugs, glitches, or other errors in application programs. Aspects of the disclosure relate to analyzing application-store reviews and mining defect information from them to identify defects in application programs. The defect information may include information regarding a description of the defect, the defect correlation, and the criticality of the defect. For instance, a user may post a review stating that “The login screen is not working,” or “The app crashes whenever I click the ‘submit’ button.” Aspects of the disclosure relate to collecting and analyzing a set of application-store reviews and mapping them to a set of test cases (e.g., development tests) performed on the application program during the development phase. For instance, top-k neighborhood algorithms may be used to retrieve top-k test cases for each defect identified based on the application-store reviews, and one or more top-k test cases for each defect may be selected based on the semantic similarity in the widget text between defects and the test cases.
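The top-k selection described above can be sketched as follows. This is a minimal illustration only: it substitutes simple token overlap (Jaccard similarity) for the semantic similarity measure, which the disclosure does not specify, and the function names, test case names, and widget text are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_test_cases(defect_text, test_cases, k=2):
    """Return up to k test-case names whose widget text best matches the defect.

    Token overlap stands in for semantic similarity here (an assumption).
    """
    defect_tokens = tokens(defect_text)
    scored = [
        (jaccard(defect_tokens, tokens(widget_text)), name)
        for name, widget_text in test_cases.items()
    ]
    # Highest similarity first; ties broken by name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical archived test cases, keyed by name, with their widget text.
test_cases = {
    "tc_login": "login screen username password sign in button",
    "tc_submit": "submit button order form",
    "tc_settings": "settings page theme toggle",
}
matches = top_k_test_cases(
    "The app crashes whenever I click the 'submit' button", test_cases
)
```

In this sketch the review about the "submit" button ranks the hypothetical `tc_submit` case first and `tc_login` second, since both mention a button; a production system would likely replace the overlap measure with an embedding-based or otherwise semantic comparison.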

Based on the test cases and the application-store reviews, the test cases may be correlated with corresponding code components which are executed during test-case playback. Method and component runtime logs for each test case may be collected, and the code components may be mapped with the defects identified from the application-store reviews. Aspects of the disclosure relate to determining a score (e.g., fragility data) for each of the code components to indicate the degree of fragility of each code component. The fragility data may be calculated based on one or more parameters including a correlation between the defects, a correlation between test cases, a defect criticality, a cluster density and correlation for test cases, or a code complexity and inter-dependency based on data/control flow analysis. By sorting and classifying the fragility data for a set of code components, fragile code components may be detected and prioritized (e.g., for debugging). In some cases, when new reviews become available in the application store for a particular application program, the fragility data for the set of code components may be recomputed.
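As an illustration of the mapping step, the sketch below joins a defect-to-test-case mapping with a test-case-to-component mapping of the kind that runtime logs collected during test-case playback could provide. The data structures, identifiers, and component names are assumptions made for the example, not part of the disclosure.

```python
from collections import defaultdict


def map_defects_to_components(defect_to_tests, test_to_components):
    """Attribute each defect to every component its matched test cases executed."""
    component_defects = defaultdict(set)
    for defect, tests in defect_to_tests.items():
        for test in tests:
            # Components seen in the (hypothetical) runtime log of this test case.
            for component in test_to_components.get(test, ()):
                component_defects[component].add(defect)
    return dict(component_defects)


# Hypothetical inputs: defects matched to test cases, and logged components.
defect_to_tests = {"D1": ["tc_login"], "D2": ["tc_login", "tc_submit"]}
test_to_components = {
    "tc_login": ["AuthModule", "SessionStore"],
    "tc_submit": ["FormHandler", "SessionStore"],
}
mapping = map_defects_to_components(defect_to_tests, test_to_components)
```

Components that appear in the logs of many defect-related test cases accumulate more defects in the result, which is the raw material for a fragility score.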

Aspects of the disclosure relate to a system, method, and computer program product for debugging a set of code components of an application program. A set of defect data which indicates a set of defects may be collected. The set of defect data may be derived from a set of post-compilation users of the application program. A set of test case data which indicates a set of user interface features of the application program may be collected. The set of test case data may be derived from a set of development tests of the application program. Using both the set of defect data and the set of test case data, a set of fragility data for the set of code components of the application program may be determined. In embodiments, the set of fragility data may indicate a set of fragility extents for the set of code components of the application program. In embodiments, the set of fragility data may indicate a set of fragility nature-types for the set of code components of the application program.

Aspects of the disclosure relate to debugging the set of code components of the application program based on the fragility data for the set of code components of the application program. In embodiments, debugging the set of code components may include establishing a breakpoint linked with the set of code components of the application program in an automated fashion based on the set of fragility data. In response to triggering the breakpoint, the set of code components of the application program linked with the breakpoint may be presented, and the set of code components of the application program may be modified. In embodiments, aspects of the disclosure relate to retrieving a set of updated defect data using a synchronization criterion, and using both the set of updated defect data and the set of test case data to determine a set of updated fragility data for the set of code components of the application program. The set of code components may be debugged based on the set of updated fragility data. Altogether, aspects of the disclosure can have performance or efficiency benefits (e.g., reliability, speed, flexibility, responsiveness, stability, high availability, resource usage, productivity). Aspects may save resources such as bandwidth, disk, processing, or memory.
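One way automated breakpoint establishment could look is sketched below. The score threshold, component names, and source locations are hypothetical; the generated `break path:line` strings follow the command form accepted by debuggers such as pdb and gdb, but the disclosure does not prescribe any particular debugger or threshold.

```python
def breakpoint_commands(fragility_scores, locations, threshold=70):
    """Build debugger breakpoint commands for components at or above a threshold.

    fragility_scores: component name -> score (assumed 0-100 scale).
    locations: component name -> (source path, line number).
    """
    commands = []
    # Most fragile first; ties broken by name for a stable order.
    ranked = sorted(fragility_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    for component, score in ranked:
        if score >= threshold and component in locations:
            path, line = locations[component]
            commands.append(f"break {path}:{line}")
    return commands


# Hypothetical fragility scores and source locations.
scores = {"AuthModule": 91, "FormHandler": 74, "ThemeToggle": 11}
locations = {
    "AuthModule": ("auth.py", 37),
    "FormHandler": ("forms.py", 12),
    "ThemeToggle": ("ui.py", 5),
}
commands = breakpoint_commands(scores, locations)
```

When updated defect data arrives and the scores are recomputed, the same function can be rerun to refresh the breakpoint list.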

Turning now to the figures, FIG. 1 depicts a high-level block diagram of a computer system for implementing various embodiments of the present disclosure, according to embodiments. The mechanisms and apparatus of the various embodiments disclosed herein apply equally to any appropriate computing system. The major components of the computer system 100 include one or more processors 102, a memory 104, a terminal interface 112, a storage interface 114, an I/O (Input/Output) device interface 116, and a network interface 118, all of which are communicatively coupled, directly or indirectly, for inter-component communication via a memory bus 106, an I/O bus 108, a bus interface unit 109, and an I/O bus interface unit 110.

The computer system 100 may contain one or more general-purpose programmable central processing units (CPUs) 102A and 102B, herein generically referred to as the processor 102. In embodiments, the computer system 100 may contain multiple processors; however, in certain embodiments, the computer system 100 may alternatively be a single CPU system. Each processor 102 executes instructions stored in the memory 104 and may include one or more levels of on-board cache.

In embodiments, the memory 104 may include a random-access semiconductor memory, storage device, or storage medium (either volatile or non-volatile) for storing or encoding data and programs. In certain embodiments, the memory 104 represents the entire virtual memory of the computer system 100, and may also include the virtual memory of other computer systems coupled to the computer system 100 or connected via a network. The memory 104 can be conceptually viewed as a single monolithic entity, but in other embodiments the memory 104 is a more complex arrangement, such as a hierarchy of caches and other memory devices. For example, memory may exist in multiple levels of caches, and these caches may be further divided by function, so that one cache holds instructions while another holds non-instruction data, which is used by the processor or processors. Memory may be further distributed and associated with different CPUs or sets of CPUs, as is known in any of various so-called non-uniform memory access (NUMA) computer architectures.

The memory 104 may store all or a portion of the various programs, modules and data structures for processing data transfers as discussed herein. For instance, the memory 104 can store a code component debugging application 150. In embodiments, the code component debugging application 150 may include instructions or statements that execute on the processor 102 or instructions or statements that are interpreted by instructions or statements that execute on the processor 102 to carry out the functions as further described below. In certain embodiments, the code component debugging application 150 is implemented in hardware via semiconductor devices, chips, logical gates, circuits, circuit cards, and/or other physical hardware devices in lieu of, or in addition to, a processor-based system. In embodiments, the code component debugging application 150 may include data in addition to instructions or statements.

The computer system 100 may include a bus interface unit 109 to handle communications among the processor 102, the memory 104, a display system 124, and the I/O bus interface unit 110. The I/O bus interface unit 110 may be coupled with the I/O bus 108 for transferring data to and from the various I/O units. The I/O bus interface unit 110 communicates with multiple I/O interface units 112, 114, 116, and 118, which are also known as I/O processors (IOPs) or I/O adapters (IOAs), through the I/O bus 108. The display system 124 may include a display controller, a display memory, or both. The display controller may provide video, audio, or both types of data to a display device 126. The display memory may be a dedicated memory for buffering video data. The display system 124 may be coupled with a display device 126, such as a standalone display screen, computer monitor, television, or a tablet or handheld device display. In one embodiment, the display device 126 may include one or more speakers for rendering audio. Alternatively, one or more speakers for rendering audio may be coupled with an I/O interface unit. In alternate embodiments, one or more of the functions provided by the display system 124 may be on board an integrated circuit that also includes the processor 102. In addition, one or more of the functions provided by the bus interface unit 109 may be on board an integrated circuit that also includes the processor 102.

The I/O interface units support communication with a variety of storage and I/O devices. For example, the terminal interface unit 112 supports the attachment of one or more user I/O devices 120, which may include user output devices (such as a video display device, speaker, and/or television set) and user input devices (such as a keyboard, mouse, keypad, touchpad, trackball, buttons, light pen, or other pointing device). A user may manipulate the user input devices using a user interface, in order to provide input data and commands to the user I/O device 120 and the computer system 100, and may receive output data via the user output devices. For example, a user interface may be presented via the user I/O device 120, such as displayed on a display device, played via a speaker, or printed via a printer.

The storage interface 114 supports the attachment of one or more disk drives or direct access storage devices 122 (which are typically rotating magnetic disk drive storage devices, although they could alternatively be other storage devices, including arrays of disk drives configured to appear as a single large storage device to a host computer, or solid-state drives, such as flash memory). In some embodiments, the storage device 122 may be implemented via any type of secondary storage device. The contents of the memory 104, or any portion thereof, may be stored to and retrieved from the storage device 122 as needed. The I/O device interface 116 provides an interface to any of various other I/O devices or devices of other types, such as printers or fax machines. The network interface 118 provides one or more communication paths from the computer system 100 to other digital devices and computer systems; these communication paths may include, e.g., one or more networks 130.

Although the computer system 100 shown in FIG. 1 illustrates a particular bus structure providing a direct communication path among the processors 102, the memory 104, the bus interface unit 109, the display system 124, and the I/O bus interface unit 110, in alternative embodiments the computer system 100 may include different buses or communication paths, which may be arranged in any of various forms, such as point-to-point links in hierarchical, star or web configurations, multiple hierarchical buses, parallel and redundant paths, or any other appropriate type of configuration. Furthermore, while the I/O bus interface unit 110 and the I/O bus 108 are shown as single respective units, the computer system 100 may, in fact, contain multiple I/O bus interface units 110 and/or multiple I/O buses 108. While multiple I/O interface units are shown, which separate the I/O bus 108 from various communications paths running to the various I/O devices, in other embodiments, some or all of the I/O devices are connected directly to one or more system I/O buses.

In various embodiments, the computer system 100 is a multi-user mainframe computer system, a single-user system, or a server computer or similar device that has little or no direct user interface, but receives requests from other computer systems (clients). In other embodiments, the computer system 100 may be implemented as a desktop computer, portable computer, laptop or notebook computer, tablet computer, pocket computer, telephone, smart phone, or any other suitable type of electronic device.

FIG. 2 is a flowchart illustrating a method 200 for debugging a set of code components of an application program. Aspects of FIG. 2 relate to determining and debugging a set of code components based on defect data collected from application-store reviews for an application program. The application program may include a collection of programming code or other computing instructions for implementing a specific task or operation. For instance, the application program may include accounting software, billing management software, supply chain management software, enterprise asset management or resource planning software, database management software, or other types of computer programs. In embodiments, the application program may be distributed through an application store (e.g., online marketplace through which users can purchase and post reviews for application programs). Aspects of the disclosure relate to the recognition that, in embodiments, reviews posted for an application program on an application store may include information regarding defects or errors of the application program. Accordingly, aspects of the disclosure relate to mining data from application-store reviews to identify fragile code components (e.g., portions of code associated with a high density of errors) and perform debugging operations. Leveraging application-store reviews for determining the degree of fragility of code components may be associated with defect identification, code debugging efficiency, and application program reliability. The method 200 may begin at block 201.

In embodiments, aspects of the disclosure relate to debugging a set of code components. Generally, the set of code components may include segments, blocks, lines, sections, chunks or other portions of programming code of an application program. In embodiments, the set of code components may include a group of consecutively located lines of code. For instance, a particular code component may include the programming code located between lines 37 and 84 of a source code document. In embodiments, the set of code components may include a set of computer instructions configured to provide a particular function or group of related functions (e.g., distributed segments of code may be related to implementation of the same function). As examples, the set of code components may include a group of distributed lines of code that relate to displaying a calculator function, enabling file upload (e.g., to a server), facilitating user login (e.g., to a service) or the like. In embodiments, the set of code components may include a group of modules (e.g., self-contained portions of code configured to execute a particular aspect of a feature or function). For instance, the set of code components may include a first module configured to handle a procedure for user authentication, and a second module configured for facilitating database access for the authenticated user. In certain embodiments, code components may include one or more classes (e.g., extensible program-code templates providing initial values for states or behaviors), methods (e.g., procedure associated with a data object), or features (e.g., functions or operations). Other types of code components are also possible.

In embodiments, the collecting of the set of defect data, the collecting of the set of test case data, the determining of the set of fragility data, and other steps described herein may each occur in an automated fashion without user intervention at block 204. In embodiments, the collecting of the set of defect data, the collecting of the set of test case data, the determining of the set of fragility data, and other steps described herein may be carried out by an internal code component debugging module maintained in a persistent storage device of a local computing device (e.g., computer or server connected to a local network). In certain embodiments, the collecting of the set of defect data, the collecting of the set of test case data, the determining of the set of fragility data, and other steps described herein may be carried out by an external code component debugging module hosted by a remote computing device or server (e.g., server accessible via a subscription, usage-based, or other service model). In this way, aspects of code component debugging may be performed using automated computing machinery without manual action. Other methods of performing the steps described herein are also possible.

At block 210, a set of defect data which indicates a set of defects may be collected with respect to the application program. The set of defect data may be derived from a set of post-compilation users of the application program. Generally, collecting can include gathering, aggregating, accumulating, or otherwise acquiring the set of defect data. Aspects of the disclosure relate to the recognition that, in embodiments, defects or errors of an application program may be discovered after completion of the development process (e.g., by users of the application program). Accordingly, aspects of the disclosure relate to collecting a set of defect data and using it to facilitate debugging of the application program. In embodiments, collecting the set of defect data may include mining data from application-store reviews posted by users of the application program. For instance, a written review that describes a problem regarding a data submission interface of an application program may be extracted together with posted screenshots illustrating the nature of the defect. The set of defect data may include information that indicates errors, bugs, glitches, or other irregularities of the application program. For instance, the set of defect data may include textual data (e.g., written description of defects), image/video data (e.g., screenshots or other visual illustrations), rating data (e.g., assessments of the quality of the application program) and other types of information. In embodiments, the set of defect data may indicate particular defects (e.g., application freezes, login failures, time-outs) of the application program. In embodiments, the set of defect data may be derived from a set of post-compilation users (e.g., non-developer users). The set of post-compilation users may include individuals using the application program as a published product (e.g., as an alpha release, beta release, final version). As an example, collecting the set of defect data may include aggregating product reviews posted on an application distribution site by users who purchased the application program as customers. Other methods of collecting the set of defect data are also possible.

Consider the following example. An application program may include a mobile software app for checking train schedules in a metropolitan area. The application program may be distributed to users via a mobile application store. Users of the application may post reviews on the mobile application store describing their experiences with the application, as well as bugs or glitches they have encountered. For instance, a first user review may include a textual description that explains that the application miscalculates train fare for traveling from a first station to a second station, but displays the correct value when the origin station and the destination are reversed. The user review may also include screenshots that illustrate the miscalculated train fare. As described herein, collecting the set of defect data may include identifying the first user review (e.g., the textual description as well as the screenshot) together with other user reviews that relate to the application program, and extracting defect data from the set of reviews (e.g., using natural language processing or other content analysis techniques to identify the defects). Other methods of collecting the set of defect data are also possible.
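The extraction step in this example might be approximated as shown below. The sketch uses keyword patterns as a simple stand-in for the natural language processing mentioned above; the pattern set, label names, and review record fields are all assumptions made for illustration.

```python
import re

# Hypothetical defect categories and the keyword patterns that signal them.
DEFECT_PATTERNS = {
    "crash": r"\bcrash(?:es|ed|ing)?\b",
    "incorrect_result": r"\b(?:miscalculat\w*|wrong|incorrect)\b",
    "not_working": r"\bnot working\b|\bdoes(?:n't| not) work\b",
}


def extract_defects(reviews):
    """Return one defect record per review that matches any defect pattern."""
    records = []
    for review in reviews:
        text = review["text"].lower()
        labels = [
            label
            for label, pattern in DEFECT_PATTERNS.items()
            if re.search(pattern, text)
        ]
        if labels:
            records.append(
                {
                    "review_id": review["id"],
                    "labels": labels,
                    "rating": review["rating"],
                    "has_screenshot": bool(review.get("screenshots")),
                }
            )
    return records


# Hypothetical reviews for the train schedule app.
reviews = [
    {"id": 1, "rating": 2, "screenshots": ["fare.png"],
     "text": "The app miscalculates the fare from Station A to Station B."},
    {"id": 2, "rating": 5, "text": "Great app, very handy for commuting."},
    {"id": 3, "rating": 1, "text": "Crashes whenever I open the timetable."},
]
defects = extract_defects(reviews)
```

Here the fare review and the crash review each yield a defect record, while the purely positive review is skipped. Keyword matching misses paraphrases and sarcasm, which is why a fuller content analysis technique would be preferable in practice.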

At block 230, a set of test case data which indicates a set of user interface features of the application program may be collected. The set of test case data may be derived from a set of development tests of the application program. Generally, collecting can include gathering, aggregating, identifying, finding, selecting or otherwise acquiring the set of test case data. The set of test case data may include information regarding archived parameter configurations in which the application was tested during development. For instance, the set of test case data may specify which aspects (e.g., systems, code components, modules) of the application program were tested, how they were tested, the configuration of the testing environment, the outcome of the tests, written descriptions/comments explaining how and why the application was tested, and other types of data. The set of test case data may indicate a set of user interface features (e.g., data objects, interface elements, other aspects) of the application program. For instance, the set of test case data may reference a particular screen, page, button, image, or interface element of the application program. In embodiments, the set of test case data may be derived from a set of development tests of the application program. The set of development tests may include experiments, investigations, evaluations, or other analyses performed to assess one or more aspects of the application program during development (e.g., by developers of the application program prior to public release). In embodiments, collecting the set of test case data may include searching an archived test suite (e.g., organized series of tests used to evaluate the behavior of an application program) for one or more test cases that relate to a particular defect indicated by the set of defect data. For instance, a set of test case data relating to “login procedures” may be identified and collected from the test suite based on a set of defect data that indicates a problem with the login procedure of the application program. Other methods of collecting the set of test case data are also possible.

Consider the following example. A set of defect data may be collected which indicates a defect with respect to the “User Profile Data Submission” screen of an application program. For instance, the set of defect data may include a written description that explains how, after entering their information, a user presses the “Save” button, but the screen simply refreshes, deleting the received data without saving it in a profile for the user. Based on the set of defect data, a test suite for the application program may be searched for test cases that pertain to “User Profile Data Entry,” “Data Submission Procedure,” and other related test cases. In embodiments, the set of test cases may indicate how the code components configured to implement the “User Profile Data Submission” screen were tested during development. For instance, one test case may indicate that the code components were tested for first names up to 8 characters (e.g., first names exceeding 8 characters were not tested). As such, it may be determined that the “User Profile Data Submission” screen may encounter an error when names longer than 8 characters (e.g., Josephine, Alexander, Montgomery) are entered. Other methods of collecting a set of test case data are also possible.

At block 250, a set of fragility data may be determined for the set of code components of the application program. The set of fragility data may be determined using both the set of defect data and the set of test case data. Generally, determining can include computing, calculating, formulating, generating, or otherwise ascertaining the set of fragility data. The set of fragility data may include a quantitative or qualitative indication of the sensitivity, proclivity to malfunction, likelihood to behave irregularly, or error frequency of the set of code components. For instance, the set of fragility data may indicate that a particular code component has a high likelihood to malfunction in particular operating configurations, is easily affected by changes to other code components, or could impact a significant number of other code components if changed (e.g., code components considered to be “fragile” may be responsible for a relatively high proportion of defects in the application program). In embodiments, determining the set of fragility data may include generating an index that associates the number (e.g., total amount), frequency (e.g., number per given time), or criticality (e.g., severity) of defects (e.g., as indicated by the set of defect data) with different code components (e.g., as identified by the set of test case data). The indexed results (e.g., code components and associated defect information) may be aggregated and arranged in a qualitative or quantitative fashion to indicate the relation between one or more code components of the set of code components and the set of defects. In this way, the code components that are associated with a greater number of defects, a greater malfunction frequency, or defects of relatively greater severity may be identified, and the set of fragility data may be generated to characterize the relation between the set of defects and one or more corresponding code components. Other methods of determining the set of fragility data are also possible.
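The index described above can be illustrated with a short sketch that records, per code component, the number of defects, their frequency over an observation period, and the highest criticality seen. The 1 to 5 criticality scale, the 30-day period, the ranking rule, and the component names are assumptions for the example rather than values taken from the disclosure.

```python
from collections import defaultdict


def fragility_index(defects, period_days):
    """Index defect count, frequency, and worst criticality by code component.

    defects: iterable of (component, criticality) pairs, with criticality
    assumed to be an integer from 1 (minor) to 5 (severe).
    """
    per_component = defaultdict(list)
    for component, criticality in defects:
        per_component[component].append(criticality)
    return {
        component: {
            "count": len(values),
            "per_day": len(values) / period_days,
            "max_criticality": max(values),
        }
        for component, values in per_component.items()
    }


def rank_by_fragility(index):
    """Order components by defect count, then worst criticality, then name."""
    return sorted(
        index,
        key=lambda c: (-index[c]["count"], -index[c]["max_criticality"], c),
    )


# Hypothetical defects already attributed to components via test case data.
defects = [
    ("AuthModule", 5), ("AuthModule", 3), ("FormHandler", 2),
    ("SessionStore", 4), ("AuthModule", 4), ("SessionStore", 1),
]
index = fragility_index(defects, period_days=30)
ranking = rank_by_fragility(index)
```

The ranking rule here is deliberately simple; the other factors the disclosure names, such as defect correlation, cluster density, or code inter-dependency, could be folded in as additional fields and sort keys.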

In embodiments, the set of fragility data may indicate a set of fragility degree-extents for the set of code components of the application program at block 252. A respective subset of the set of fragility data may indicate a respective fragility degree-extent for a respective code component of the set of code components of the application program. Generally, indicating can include signifying, representing, expressing, or otherwise conveying the set of fragility degree-extents. Aspects of the disclosure relate to the recognition that, in embodiments, different code components of the set of code components may have different degrees of fragility. Accordingly, aspects of the disclosure relate to determining a set of fragility data configured to indicate a set of fragility degree-extents for the set of code components. The set of fragility degree-extents may include a measure of how fragile (e.g., proclivity to cause defects/malfunctions with respect to itself or other code components) a particular code component is. In embodiments, indicating the set of fragility degree-extents may include labeling one or more code components with a qualitative expression of fragility. For instance, a first code component may be labeled with a tag of “highly fragile” and a second code component may be labeled with a tag of “somewhat fragile.” In embodiments, indicating the set of fragility degree-extents may include assigning a quantitative score (e.g., fragility score) to express the fragility of a code component. As an example, a first code component may be assigned a score of 91 (e.g., indicating substantially high fragility) and a second code component may be assigned a score of 11 (e.g., indicating substantially low fragility). Other methods of indicating the set of fragility degree-extents are also possible.

In embodiments, the set of fragility data may indicate a set of fragility nature-types for the set of code components of the application program at block 254. A respective subset of the set of fragility data may indicate a respective fragility nature-type for a respective code component of the set of code components of the application program. Generally, indicating can include signifying, representing, expressing, or otherwise conveying the set of fragility nature-types. Aspects of the disclosure relate to the recognition that, in certain embodiments, different code components of the set of code components may be associated with different types of defects. Accordingly, aspects of the disclosure relate to determining a set of fragility data configured to indicate a set of fragility nature-types for the set of code components. The set of fragility nature-types may include characteristics, attributes, types, typologies, properties, or other aspects of the set of code components that describe the nature of the fragility (e.g., what about it is fragile). In embodiments, indicating the set of fragility nature-types may include annotating the set of code components (e.g., in an integrated development environment, source code document) with tags or markers that describe the fragility of the tagged code component. For instance, a first code component may be tagged with a written description that describes how “Updates freeze at 26%, then fails.” As another example, a second code component may be tagged with a description that describes a “High frequency of defects as indicated by application-store reviews.” Other methods of indicating the set of fragility nature-types for the set of code components are also possible.

At block 270, a set of code components of the application program may be debugged based on the set of fragility data for the set of code components of the application program. Generally, debugging can include adjusting, troubleshooting, repairing, fixing, revising, or otherwise removing errors from the set of code components. Aspects of the disclosure relate to the recognition that, in embodiments, the set of defect data and the set of test case data may be used to identify bugs (e.g., errors, malfunctions, defects, glitches) in the set of code components. Accordingly, aspects of the disclosure relate to making use of the set of fragility data to perform debug operations for the set of code components. In embodiments, debugging may include using the fragility data to identify a subset of code components associated with a fragility score above a fragility score threshold (e.g., a code component with a fragility score of 76 may exceed a fragility score threshold of 60), and initiating a code diagnostic tool to find and resolve defects of the identified code component. In embodiments, debugging may include comparing the set of fragility data with a set of debug criteria for the application program that specifies potential causes or suggested debugging procedures for the set of code components. For instance, for an application program associated with a defect of “Application crashes on start-up,” a set of debug criteria may indicate that an “Application Initiation Memory Value” is set too low. In embodiments, debugging may include using a tracing technique to run a code component and log information regarding the execution of the code component to ascertain the origin of the defect, and subsequently examining program states (e.g., values of variables, call stacks) for the application program to discover and resolve the error. Other methods of debugging the set of code components based on the set of fragility data are also possible.
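The threshold comparison described above may be illustrated with a short sketch. The function name, the dictionary layout, and the default threshold of 60 (taken from the example in the preceding paragraph) are assumptions made for illustration.

```python
def select_for_debugging(fragility_scores, threshold=60):
    """Return components whose fragility score exceeds the threshold, most fragile first.

    fragility_scores: dict mapping a code component name to its fragility score.
    """
    flagged = {c: s for c, s in fragility_scores.items() if s > threshold}
    return sorted(flagged, key=flagged.get, reverse=True)
```

A code diagnostic tool could then be initiated for each returned component in turn; a score equal to the threshold is not treated as exceeding it in this sketch.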

In embodiments, debugging the set of code components may include establishing a breakpoint linked with the set of code components of the application program at block 272. The breakpoint may be established in an automated fashion based on the set of fragility data. Generally, establishing can include instantiating, setting, creating, providing, or generating the breakpoint linked with the set of code components. The breakpoint may include an intentional stopping or pausing place in a program, configured to pause operation of the application program once triggered. For instance, the breakpoint may include a watchpoint (e.g., type of breakpoint configured to stop execution of an application when the value of a specified expression achieves a particular value). In embodiments, the breakpoint may be established based on the set of fragility data. For example, the set of fragility data (e.g., including the set of test cases) may be analyzed to ascertain one or more locations of a code component that may be associated with a bug or defect (e.g., a location that was the source of a defect during development tests), and the breakpoint may be established to isolate portions of the code component for testing. As described herein, establishing the breakpoint may be performed automatically. For instance, a code diagnostic tool may be configured to examine the set of fragility data, identify a potential origin location for a defect, and establish the breakpoint in association with the potential defect origin location within a code component. Other methods of establishing the breakpoint are also possible.
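As one possible illustration of automated breakpoint establishment, the sketch below turns fragility data into breakpoint commands in the syntax of the Python debugger's break command (break filename:lineno, optionally followed by a comma and a condition). The FragileLocation record, its fields, and the threshold are hypothetical; the disclosure does not prescribe a particular debugger or data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FragileLocation:
    component: str                   # code component name
    filename: str                    # source file of the suspected defect origin
    line: int                        # line at which to pause execution
    score: int                       # fragility score for the component
    condition: Optional[str] = None  # optional expression that must hold to pause


def breakpoint_commands(locations, threshold=60):
    """Emit one pdb-style 'break' command per location above the threshold."""
    commands = []
    for loc in sorted(locations, key=lambda l: l.score, reverse=True):
        if loc.score <= threshold:
            continue
        command = f"break {loc.filename}:{loc.line}"
        if loc.condition:
            command += f", {loc.condition}"
        commands.append(command)
    return commands
```

The resulting command strings could be supplied to a debugging session (e.g., through a debugger start-up file), so that no breakpoint has to be placed by hand.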

In embodiments, the set of code components of the application program linked with the breakpoint may be presented in response to triggering the breakpoint at block 274. Generally, presenting may include displaying, highlighting, marking, indicating, or otherwise providing the set of code components. Aspects of the disclosure relate to the recognition that triggering of a breakpoint (e.g., watchpoint) may indicate the presence of a defect or bug associated with the line or lines of code at which the watchpoint was triggered. Accordingly, in embodiments, aspects of the disclosure relate to presenting or displaying the code components associated with the watchpoint triggering in order to identify the source of application program errors. As an example, a watchpoint may include a set of instructions that defines that a particular code component is to run from Line 118 to Line 146, one line at a time, unless one or more expressions return a value of “False” when executed. Accordingly, the code component may be initiated, and may proceed without error until Line 131, at which point a value of “False” is returned and the watchpoint is triggered. In response to triggering the watchpoint, Line 131 may be highlighted (e.g., in red or yellow) to indicate the potential presence of an error or defect. Other methods of presenting the set of code components linked to the breakpoint in response to triggering of the breakpoint are also possible.
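The watchpoint example above may be modeled in simplified form as follows. Here each line of the code component is represented by a hypothetical callable expression, and the function reports the first line whose expression returns False; a real debugger would evaluate expressions against live program state instead.

```python
def run_until_false(checks):
    """Step through (line_number, expression) pairs; return the line that triggers.

    Returns None if every expression evaluates without returning False.
    """
    for line_number, expression in checks:
        if expression() is False:
            return line_number
    return None


# Hypothetical program state and per-line expressions for Lines 118 to 146.
state = {"balance": 10}
checks = [(n, lambda: True) for n in range(118, 147)]
checks[131 - 118] = (131, lambda: state["balance"] < 0)

triggered_line = run_until_false(checks)  # the line to highlight for the developer
```

In this sketch the expression attached to Line 131 is the only one that returns False, so that line is the one reported for presentation.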

In embodiments, the set of code components of the application program may be modified at block 276. Generally, modifying can include adjusting, altering, repairing, fixing, revising, or otherwise changing the set of code components. Aspects of the disclosure relate to the recognition that, in response to identifying one or more locations of a code component that are potentially associated with an error or defect, modifying the set of code components to remove or resolve the error may be associated with positive impacts to application program performance. In embodiments, modifying the set of code components may include altering the value of a variable or parameter, adding an additional expression or instruction, removing a portion of code, or performing another action with respect to the set of code components. As an example, modifying the set of code components may include rewriting the set of code components to remove a variable or line of code that may be the cause of a defect. Other methods of modifying the set of code components are also possible.

Consider the following example. An application program may include a mobile online banking application. A set of defect data including a set of application-store reviews for the online banking application may be collected. In embodiments, the set of defect data may include a first subset of 11 reviews that relate to a “Log-In Screen Failure,” a second subset of 2 reviews that relate to an “Account Display Error,” and a third subset of 24 reviews that relate to “Account Statement Print Failure.” Based on the set of defect data, a set of test case data may be collected. In embodiments, the set of test case data may include a subset of test cases that correspond to each respective defect indicated by the set of defect data. Based on the set of defect data and the set of test case data, a set of fragility data for the set of code components may be determined. In embodiments, the number of user reviews and relative severity of each defect may be weighted in order to calculate a fragility degree-extent for a set of code components corresponding to the set of identified defects. For example, in certain embodiments, the “Log-In Screen Failure” defect may be assigned a fragility degree-extent of “98” (e.g., high number of reviews and high degree of relative severity), the “Account Display Error” may be assigned a fragility degree-extent of “44” (e.g., moderately high severity but few users impacted), and the “Account Statement Print Failure” may be assigned a fragility degree-extent of “62” (e.g., relatively low severity but higher number of impacted users). Based on the set of fragility data, the set of code components that correspond to the identified defects may be debugged. In embodiments, as described herein, the set of code components may be debugged in order from highest fragility degree-extent to lowest (e.g., the code component associated with the “Log-In Screen Failure” is debugged first, followed by the “Account Statement Print Failure” second and the “Account Display Error” third). Other methods of debugging the set of code components based on the set of fragility data determined from application-store user reviews are also possible.
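The weighting of review counts against relative severity may be sketched as below. The review counts (11, 2, and 24) come from the example; the severity values, the equal weights, and the normalization are assumptions chosen for illustration, so the resulting scores are not intended to reproduce the degree-extents of 98, 44, and 62 given above, only the same debugging order.

```python
def weighted_fragility(review_count, severity, max_reviews, max_severity=5,
                       review_weight=0.5, severity_weight=0.5):
    """Combine normalized review count and severity into a 0-100 score."""
    review_part = review_count / max_reviews
    severity_part = severity / max_severity
    return 100 * (review_weight * review_part + severity_weight * severity_part)


# (review count from the example, hypothetical severity on a 1-5 scale)
defects = {
    "Log-In Screen Failure": (11, 5),
    "Account Display Error": (2, 4),
    "Account Statement Print Failure": (24, 2),
}

max_reviews = max(count for count, _ in defects.values())
scores = {
    name: weighted_fragility(count, severity, max_reviews)
    for name, (count, severity) in defects.items()
}

# Debug from the highest fragility degree-extent to the lowest.
debug_order = sorted(scores, key=scores.get, reverse=True)
```

Different weights shift the balance between widely reported defects and severe but rarely reported defects, which is the trade-off the example describes.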

In embodiments, the collecting of the set of defect data, the collecting of the set of test case data, the determining of the set of fragility data, and other steps described herein may each occur in a dynamic fashion to streamline debugging at block 294. For instance, the collecting of the set of defect data, the collecting of the set of test case data, the determining of the set of fragility data, and other steps described herein may occur in a real-time, ongoing, or on-the-fly fashion. As an example, one or more steps described herein may be performed in an ongoing fashion (e.g., defect data may be automatically collected in response to detection) in order to streamline (e.g., facilitate, promote, enhance) debugging of the set of code components.

Method 200 concludes at block 299. As described herein, aspects of method 200 relate to determining a degree of fragility of a set of code components based on defect data derived from application-store reviews. Aspects of method 200 may provide performance or efficiency benefits for application program reliability. As an example, defects and errors that are difficult to simulate in development testing environments may be discovered by users (e.g., in real-world contexts), detected based on user reviews, and resolved using debugging techniques to facilitate application program usability. Altogether, leveraging application-store reviews for determining the degree of fragility of code components may be associated with defect identification, code debugging efficiency, and application program reliability.

FIG. 3 is a flowchart illustrating a method 300 for debugging a set of code components of an application program. Aspects of FIG. 3 relate to periodically synchronizing with application-store review updates to re-compute fragility data for a set of code components. Aspects of the disclosure relate to the recognition that, as new application-store reviews for an application program become available (e.g., are posted), it may be desirable to revise the fragility data for the code components of the application program to reflect the updated user reviews. Accordingly, aspects of the disclosure relate to periodically receiving updated defect data (e.g., corresponding to a set of code components), determining updated fragility data based on the updated defect data, and debugging the set of code components based on the set of updated fragility data. The method 300 may begin at block 301.

In embodiments, a set of updated defect data may be retrieved at block 320. The set of updated defect data may be retrieved using a synchronization technique. Generally, retrieving can include acquiring, gathering, fetching, aggregating, or accumulating the set of updated defect data. The set of updated defect data may include information that indicates errors, bugs, glitches, or other irregularities of the application program. The set of updated defect data may include information that has been revised or modified with respect to the set of defect data, or information that was not included in the original set of defect data. In embodiments, retrieving the set of updated defect data may include using a synchronization technique. The synchronization technique may include a parameter, triggering condition, or threshold configured such that when achieved, retrieval of the set of updated defect data may be initiated. As examples, the synchronization technique may include a temporal period (e.g., retrieve every 30 minutes, 1 hour, 2 days), an ongoing collection protocol (e.g., new data is collected as soon as it becomes available), a code modification (e.g., retrieve data in response to/subsequent to changes to one or more code components), a threshold data increase (e.g., user reviews above a threshold number, text characters above a threshold amount), or the like. For instance, retrieving the set of updated defect data may include monitoring the application store, and importing new user reviews for an application program once each day. Other methods of retrieving the set of updated defect data using the synchronization technique are also possible.
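The triggering conditions listed above may be combined in a simple decision function, sketched below. The SyncPolicy record, its default values (a one-day period and a threshold of 50 new reviews), and the parameter names are assumptions for illustration; retrieval of the reviews themselves is outside the scope of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SyncPolicy:
    period: timedelta = timedelta(days=1)  # temporal period between retrievals
    new_review_threshold: int = 50         # threshold data increase (review count)


def should_sync(policy, last_sync, now, pending_reviews, code_changed=False):
    """Decide whether retrieval of updated defect data should be initiated."""
    if code_changed:                                    # code modification trigger
        return True
    if pending_reviews >= policy.new_review_threshold:  # threshold data increase
        return True
    return now - last_sync >= policy.period             # temporal period elapsed
```

An ongoing collection protocol corresponds to setting new_review_threshold to 1, so that any newly posted review initiates retrieval.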

In embodiments, a set of updated fragility data for the set of code components of the application program may be determined using both the set of updated defect data and the set of test case data at block 340. Generally, determining can include computing, calculating, formulating, generating, or otherwise ascertaining the set of updated fragility data. In embodiments, the set of updated fragility data may include a modified or revised indication of the sensitivity, proclivity to malfunction, or error frequency of the set of code components. In embodiments, determining the set of updated fragility data may include evaluating the set of updated defect data, and identifying a set of test case data (e.g., additional/new test cases) that corresponds to the defects indicated by the set of updated defect data. Using the set of updated defect data and the set of test case data, the set of updated fragility data may be computed to provide a revised (e.g., up-to-date) indication of the fragility of the set of code components. As an example, in response to retrieving a set of updated defect data that indicates a recent code update has resulted in a multitude of new errors with respect to the “Settings” screen of an application program, a set of updated fragility data may be determined that indicates “high fragility” with respect to the code components corresponding to the “Settings” screen. Other methods of determining the set of updated fragility data are also possible.

In embodiments, the set of code components of the application program may be debugged based on the set of updated fragility data at block 360. Generally, debugging can include troubleshooting, adjusting, repairing, fixing, revising, or otherwise removing errors from the set of code components. In embodiments, debugging may include using the set of updated fragility data to identify a new subset of code components associated with errors or defects, and initiating a code diagnostic tool to locate and resolve the defects of the identified code components. For instance, the set of updated fragility data may be used to identify code components associated with fragility degree-extents that achieve a threshold, or that changed by a threshold degree (e.g., with respect to the original set of fragility data), for prioritized debugging. In embodiments, debugging may include analyzing a set of error messages indicated by the set of fragility data, and ascertaining particular lines or portions of code that may be the cause of the defect. The ascertained locations may then be parsed for uninitialized variables, invalid functions, or other potential errors, and resolved. Other methods of debugging the set of code components based on the set of updated fragility data are also possible. The method 300 may end at block 399.
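The prioritization described above, in which a code component is selected either because its updated fragility degree-extent achieves a threshold or because it changed by a threshold degree, may be sketched as follows. The function name and both default thresholds are assumptions for illustration.

```python
def prioritize_after_update(original, updated, score_threshold=60, change_threshold=20):
    """Select components for prioritized debugging after a fragility update.

    original, updated: dicts mapping component name to fragility degree-extent.
    A component missing from `original` is treated as having a prior score of 0.
    """
    flagged = []
    for component, new_score in updated.items():
        old_score = original.get(component, 0)
        achieved = new_score >= score_threshold
        shifted = abs(new_score - old_score) >= change_threshold
        if achieved or shifted:
            flagged.append(component)
    return sorted(flagged, key=lambda c: updated[c], reverse=True)
```

For example, a “Settings” component whose score rises from 20 to 55 would be flagged by the change criterion even though it remains below the score threshold.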

FIG. 4 is a flowchart illustrating a method 400 for debugging a set of code components of an application program. Aspects of FIG. 4 relate to correlating a set of code components with a set of defects using a set of test cases. Aspects of the disclosure relate to the recognition that, in some situations, application program defects indicated by application store reviews may not designate the specific code component associated with the defect (e.g., users of the application may not be familiar with the underlying code structure of applications). Accordingly, aspects of the disclosure relate to using test cases (e.g., based on development tests of the application program) to facilitate mapping of defect data with a set of code components (e.g., for debugging). In embodiments, aspects of method 400 may substantially correspond to embodiments described herein and illustrated in FIGS. 1-9. At block 410, a set of defect data may be collected. At block 430, a set of test case data may be collected. At block 450, a set of fragility data may be determined. At block 470, the set of code components may be debugged. The method may begin at block 401.

In embodiments, the set of code components may be correlated with the set of defects at block 452. A respective code component may correlate with a respective defect. Generally, correlating can include mapping, linking, connecting, coupling, coordinating, or otherwise associating the set of code components with the set of defects. Aspects of the disclosure relate to the recognition that, in some situations, a defect indicated by the set of defect data may be related to (e.g., the result of) an error in a code component of the application program. Accordingly, aspects of the disclosure relate to correlating one or more defects of the set of defect data with a respective code component. In embodiments, correlating the set of code components with the set of defects may include mapping each defect indicated by the set of defect data to a particular code component of the set of code components. For instance, defects may be mapped to code components that are associated with a high likelihood of being the cause of the defect. As an example, for a defect of “Application fails to start up,” a code component of “Application Initiation Protocol” may be identified and mapped to the defect. Other methods of correlating the set of code components with the set of defects are also possible.

In embodiments, the set of fragility data may be correlated with the set of defect data at block 454. A respective subset of the set of fragility data may correlate with a respective subset of the set of defect data, where the respective subset of the set of fragility data corresponds with the respective code component and the respective subset of the set of defect data corresponds with the respective defect. Generally, correlating can include mapping, linking, connecting, coupling, coordinating, or otherwise associating the set of fragility data with the set of defect data. As described herein, aspects of the disclosure relate to the recognition that characteristics, properties, or attributes of the set of defect data (e.g., collected from application-store user reviews) may indicate the fragility of one or more code components. Accordingly, aspects of the disclosure relate to correlating a subset of the set of defect data with a respective subset of the set of fragility data. In embodiments, correlating may include linking a portion of the defect data that describes a particular bug or error with a portion of fragility data for a code component that may be associated with the bug or error (e.g., code component configured to provide/run the feature of the application program that encountered the error). As an example, for a set of defect data that relates to multiple errors in an application program, a subset of the set of defect data pertaining to a particular error (e.g., login username is not accepted even though it is correct) may be linked with a subset of fragility data for a code component (e.g., fragility data indicating high frequency of issues related to a “login component”). Other methods of correlating the set of fragility data with the set of defect data are also possible.

In embodiments, the set of defect data may be correlated with the set of test case data at block 456. A respective subset of the set of test case data may correspond with a respective test case, the respective subset of the set of defect data may correlate with the respective subset of the set of test case data, and the respective defect may correlate with the respective test case. Generally, correlating can include mapping, linking, connecting, coupling, coordinating, or otherwise associating the set of defect data with the set of test case data. As described herein, aspects of the disclosure relate to the recognition that the particular code component associated with a defect in the application program may not be apparent based on the set of defect data alone (e.g., users may not specify the code components or source of defects in reviews). Accordingly, in embodiments, aspects of the disclosure relate to mapping the set of defect data with a set of test case data. The set of test case data may include information regarding archived parameter configurations in which the application was tested during development. For instance, the set of test case data may describe errors or defects that were encountered while the application program was in development, as well as the parameter configurations under which those defects arose. In embodiments, correlating the set of defect data with the set of test case data may include using a natural language processing technique to extract semantic and syntactic information from the set of defect data, and subsequently using the extracted semantic and syntactic information to ascertain a set of test cases that relate to the set of defects (e.g., semantic or syntactic similarity between the set of defect data and the set of test case data). For instance, a subset of defect data that describes a defect of “user inputs not registered” may be analyzed and mapped to a test case that states “User input recognition protocol fails upon repeated page refreshes.” Other methods of correlating the set of defect data with the set of test cases are also possible.
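A deliberately crude stand-in for the natural language processing technique described above is sketched below. It substitutes keyword overlap (Jaccard similarity over lightly normalized tokens) for true semantic analysis; the stop-word list, the plural-stripping rule, and the test case identifiers are assumptions for illustration only.

```python
import re

STOP_WORDS = {"a", "an", "the", "not", "does", "upon", "is", "to"}


def tokens(text):
    """Lowercase, split into words, drop stop words, strip a trailing plural 's'."""
    result = set()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in STOP_WORDS:
            continue
        if word.endswith("s") and len(word) > 3:
            word = word[:-1]
        result.add(word)
    return result


def similarity(defect_text, test_case_text):
    """Jaccard similarity between the token sets of the two descriptions."""
    a, b = tokens(defect_text), tokens(test_case_text)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_test_case(defect_text, test_cases):
    """Return the (identifier, score) of the most similar test case description."""
    best_id = max(test_cases, key=lambda tc: similarity(defect_text, test_cases[tc]))
    return best_id, similarity(defect_text, test_cases[best_id])
```

With the defect “user inputs not registered,” the test case describing the user input recognition protocol shares the tokens “user” and “input” and is therefore selected over unrelated test cases.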

In embodiments, the set of test case data may be correlated with the set of fragility data at block 458. The respective subset of the set of test case data may correlate with the respective subset of the set of fragility data, and the respective test case data may correlate with the respective defect. Generally, correlating can include mapping, linking, connecting, coupling, coordinating, or otherwise associating the set of test case data with the set of fragility data. As described herein, aspects of the disclosure relate to the recognition that, using the set of test case data, a connection may be established between the set of defect data and the set of fragility data for a code component. In embodiments, correlating the set of test case data with the set of fragility data may include ascertaining a code component referenced by the set of test case data, and identifying a set of fragility data coupled with the ascertained code component. As an example, a set of test case data that indicates that a defect arose in development with respect to a code component of a “Data Submission Protocol” may be mapped with a set of fragility data for the Data Submission Protocol code component. In this way, defects indicated by the set of defect data may be correlated to code components using the set of test case data. Other methods of correlating the set of test case data with the set of fragility data are also possible. The method 400 may end at block 499.

FIG. 5 is a diagram illustrating a correlation 500 between a set of defect data and a set of test case data, according to embodiments. Aspects of FIG. 5 relate to identifying a test case collection 540 corresponding to a first defect 520 indicated by application-store reviews for an application program. As described herein, users (e.g., non-developer users) may post reviews for an application program on an application-store that distributes the application program. In embodiments, the reviews may describe errors, glitches, or other defects that users have encountered while using the application program. For instance, a user may post a review describing a first defect 520 in which a weather forecast application does not display up-to-date weather information. Based on the first defect 520 described in the user review, a test case collection 540 may be identified. The test case collection 540 may include a set of test cases that describe development test environments in which defects or errors similar to the first defect 520 arose. For instance, the test case collection 540 may include a Test-Case 4 541 that describes an error in which the weather forecast application did not refresh when the location registered for a user in a user profile did not match the current location of the user (e.g., as detected by global positioning techniques). Other types of correlations between the set of defect data and the set of test case data are also possible.

FIG. 6 is a diagram illustrating a correlation 600 between a set of defect data and a set of code components using a set of test case data, according to embodiments. Aspects of FIG. 6 relate to mapping a first defect 620 to a set of code components 660 based on a test case collection 640. As described herein, a test case collection 640 may be identified based on a first defect 620 described in an application-store review (e.g., set of defect data). Aspects of the disclosure, in embodiments, relate to using the test case collection 640 to determine a set of code components 660 that correspond to the first defect 620. In embodiments, one or more test cases of the test case collection 640 may indicate a particular code component. For instance, the test case may reference a code component that was tested as part of a development software testing process. In certain embodiments, the test cases may include executable program code configured to run code components for testing. Consider the following example. A first defect 620 may relate to a defect of “Music app plays tracks in playlist order even when shuffle is pressed.” A test case collection 640 of test cases that relate to the first defect 620 may be identified (e.g., test cases that relate to playlist order, the shuffle feature). In embodiments, a Test-Case 4 641 may be executed, and instrumentation techniques may be used to record which code components are executed during test-case playback. For instance, playback of Test-Case 4 641 may indicate that a Code Component 3 661 related to “Track Randomization” was invoked. Accordingly, Code Component 3 661 may be identified as a possible source of the first defect 620. In this way, a correlation between a set of defect data and a set of code components may be established using a set of test case data. Other types of correlations between the set of defect data and the set of code components are also possible.
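The instrumentation step in the music application example may be illustrated with a small sketch in which code components are wrapped so that each invocation is recorded during test-case playback. The decorator, the component names other than “Track Randomization,” and the test case body are hypothetical; production instrumentation would typically rely on a coverage or tracing tool instead.

```python
import functools
import random

invoked = []  # names of code components executed during the current playback


def component(name):
    """Decorator that records each invocation of the wrapped code component."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            invoked.append(name)
            return fn(*args, **kwargs)
        return inner
    return wrap


@component("Playlist Loader")
def load_playlist():
    return ["track-1", "track-2", "track-3"]


@component("Track Randomization")
def shuffle_tracks(tracks, seed=0):
    shuffled = list(tracks)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def test_case_shuffle():
    """Hypothetical development test exercising the shuffle feature."""
    shuffle_tracks(load_playlist())


def record_components(test_case):
    """Play back a test case and return the components it executed, in order."""
    invoked.clear()
    test_case()
    return list(dict.fromkeys(invoked))  # de-duplicate while preserving order
```

Each component returned by record_components is a candidate source of the defect that led to the selection of the test case.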

FIG. 7 is a flowchart illustrating a method 700 for debugging a set of code components of an application program. Aspects of FIG. 7 relate to resolving a set of fragility data for a set of code components of an application program. As described herein, aspects of the disclosure relate to identifying attributes of a set of defect data (e.g., widget element identifiers) and a set of test case data (e.g., clear-script elements) for use in determining a set of fragility data for the set of code components. In embodiments, aspects of method 700 may substantially correspond to embodiments described herein and illustrated in FIGS. 1-9. At block 710, a set of defect data may be collected. At block 730, a set of test case data may be collected. At block 750, a set of fragility data may be determined. At block 770, the set of code components may be debugged. The method may begin at block 701.

In embodiments, the set of defect data and the set of test case data may be compared to determine the set of fragility data for the set of code components of the application program at block 752. The set of defect data and the set of test case data may be compared using a semantic analysis technique. The semantic analysis technique may include one or more natural language processing techniques (e.g., topic segmentation and recognition, sentiment analysis, parsing, coreference resolution, context interpretation) configured to derive meaning from natural language content. Generally, comparing can include contrasting, juxtaposing, investigating, assessing, or otherwise examining the set of defect data with respect to the set of test case data. In embodiments, comparing can include utilizing the semantic analysis technique to parse the set of defect data (e.g., user-posted application-store reviews) as well as the set of test case data (e.g., archived, written descriptions of development test configurations), and assessing the semantic similarity between the set of defect data and the set of test case data (e.g., semantic similarity between the set of defect data and the set of test case data achieves a semantic similarity threshold value). The semantic similarity may be evaluated based on the keywords, context, topics, relationships between entities/concepts, and other natural language information for the set of defect data and the set of test case data. As described herein, in response to determining semantic similarity between the set of defect data and the set of test case data, defects of the set of defect data may be correlated with code components indicated by the set of test case data, and the set of fragility data for the set of code components may be computed. Other methods of comparing the set of defect data and the set of test case data to determine the set of fragility data are also possible.

In embodiments, a set of widget element identifiers which indicates the set of defects may be identified at block 754. The set of widget element identifiers may be identified with respect to the set of defect data. Generally, identifying can include recognizing, discovering, distinguishing, detecting, ascertaining, or otherwise determining the set of widget element identifiers. The set of widget element identifiers may include the aspects, features, or components of the application program that are associated with one or more defects, errors, or irregularities. In embodiments, identifying the set of widget element identifiers may include scanning the set of defect data for the names of particular elements or features of the application program (e.g., “Submit button,” “Back Arrow,” “Drop-down Menu”), and recognizing the named features as the set of widget element identifiers. In embodiments, identifying the set of widget element identifiers may include analyzing screenshots or videos included in the set of defect data with object recognition techniques to extract the aspects or features of the application program that are associated with defects. As an example, identifying may include parsing a screenshot of a web browsing application with an associated caption that says “Page Reload Button does not actually refresh the page,” and identifying the “Page Reload Button” as a widget element identifier. Other methods of identifying the set of widget element identifiers are also possible.
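The text-scanning approach described above might be sketched in Python as follows. The list of known widget names is taken from the examples in this description, while the function name and the use of case-insensitive whole-phrase matching are assumptions made for illustration.

```python
import re

# Widget names drawn from the examples in the description; a real
# application would supply its own list (hypothetical configuration).
KNOWN_WIDGETS = [
    "Submit button",
    "Back Arrow",
    "Drop-down Menu",
    "Page Reload Button",
]


def find_widget_identifiers(defect_text, known_widgets=KNOWN_WIDGETS):
    """Return the known widget names that appear in the defect text.

    Matching is case-insensitive and requires the whole phrase to be
    bounded by non-word characters.
    """
    found = []
    for name in known_widgets:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, defect_text, flags=re.IGNORECASE):
            found.append(name)
    return found


caption = "Page Reload Button does not actually refresh the page"
identifiers = find_widget_identifiers(caption)
```

For the caption shown, the sketch recognizes "Page Reload Button" as the widget element identifier. Screenshot or video analysis with object recognition is outside the scope of this example.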

In embodiments, a set of clear-script elements which indicates the set of user interface features of the application program may be identified at block 756. The set of clear-script elements may be identified with respect to the set of test case data. Generally, identifying can include recognizing, discovering, distinguishing, detecting, ascertaining, or otherwise determining the set of clear-script elements. The set of clear-script elements may include segments of programming code or portions of the set of test case data that indicate interface features of the application program. As examples, the set of clear-script elements may include code modules that correspond to displaying user profile pictures, buttons, menus, user input interfaces, or other interface features of the application program. In embodiments, identifying the set of clear-script elements may include parsing the code and written descriptions included in the set of test case data, and extracting the code portions that pertain to implementing user-interface features of the application program as the set of clear-script elements. As an example, identifying may include analyzing a comment associated with a portion of code in a particular test case that states “Script for implementing on-screen keyboard,” and ascertaining the portion of code as a clear-script element (e.g., corresponding to implementation of an interface-feature of an on-screen keyboard). Other methods of identifying the set of clear-script elements are also possible.
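The comment-based example above can be illustrated with a short Python sketch. It assumes, purely for illustration, that test case code marks each interface script with a `#` comment of the form "Script for implementing ..." and that a blank line ends the script; neither convention is stated in the disclosure, and the function name is hypothetical.

```python
import re

# Hypothetical comment convention: "# Script for implementing <feature>".
COMMENT = re.compile(r"^\s*#\s*Script for implementing\s+(.+?)\s*$", re.IGNORECASE)


def extract_clear_script_elements(test_case_source):
    """Map each named interface feature to the code lines that follow its comment.

    A blank line closes the current element; lines outside any element
    are ignored.
    """
    elements = {}
    current = None
    for line in test_case_source.splitlines():
        match = COMMENT.match(line)
        if match:
            current = match.group(1)
            elements[current] = []
        elif not line.strip():
            current = None
        elif current is not None:
            elements[current].append(line)
    return {feature: "\n".join(code) for feature, code in elements.items()}


source = (
    "# Script for implementing on-screen keyboard\n"
    "show_keyboard()\n"
    "type_text('hi')\n"
    "\n"
    "unrelated()\n"
    "# Script for implementing search box\n"
    "open_search()\n"
)
clear_scripts = extract_clear_script_elements(source)
```

Here the sketch yields two clear-script elements, keyed by "on-screen keyboard" and "search box", and skips the unmarked `unrelated()` call.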

In embodiments, the set of widget element identifiers and the set of clear-script elements may be mapped to determine the set of fragility data for the set of code components of the application program at block 758. Mapping the set of widget element identifiers and the set of clear-script elements may correlate the set of defects and the set of user interface features of the application program. Generally, mapping can include linking, connecting, coupling, coordinating, corresponding, or otherwise associating the set of widget element identifiers and the set of clear-script elements. In embodiments, mapping may include comparing each widget element identifier of the set of widget element identifiers to the set of clear-script elements, and coupling each widget element identifier with the clear-script element(s) that relate to implementation of the interface feature with which the widget element identifier is associated (e.g., the interface feature associated with a defect). As an example, consider a widget element identifier of “search box” (e.g., the search box of an application program is associated with a defect). The widget element identifier of “search box” may be compared to the set of clear-script elements, and mapped to a clear-script element that includes a script configured to implement the search box within the application program. In this way, the set of defects may be correlated with the user interface features (and underlying code elements) with which they are associated. Other methods of mapping the set of widget element identifiers and the set of clear-script elements are also possible.
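The “search box” example above might be sketched in Python as a keyword-containment mapping. The matching rule (every word of the widget identifier must occur in the clear-script description), the function names, and the script identifiers are assumptions made for illustration rather than the mapping technique of the disclosure.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def map_widgets_to_scripts(widget_identifiers, clear_script_elements):
    """Couple each widget identifier with the clear-script elements that mention it.

    clear_script_elements maps a hypothetical script id to its written
    description. A widget maps to a script when all of the widget's
    words occur in that description.
    """
    mapping = {}
    for widget in widget_identifiers:
        widget_tokens = tokens(widget)
        mapping[widget] = [
            script_id
            for script_id, description in clear_script_elements.items()
            if widget_tokens and widget_tokens <= tokens(description)
        ]
    return mapping


scripts = {
    "cs_search": "Script for implementing the search box",
    "cs_login": "Script for implementing log-in interface",
}
mapping = map_widgets_to_scripts(["search box", "back arrow"], scripts)
```

In this sketch "search box" is coupled with `cs_search`, while "back arrow" has no matching clear-script element and maps to an empty list.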

In embodiments, the set of fragility data for the set of code components of the application program may be resolved using a clustering technique at block 760. The clustering technique may be based on the correlation of the set of defects and the set of user interface features. Generally, resolving can include computing, calculating, formulating, generating, ascertaining, or otherwise determining the set of fragility data using the clustering technique. The clustering technique may include a method or algorithm for performing statistical data analysis with respect to the distribution of defects associated with particular code components. As examples, the clustering technique may include connectivity models (e.g., hierarchical clustering), centroid models (e.g., k-means clustering), distribution models (e.g., multivariate normal distributions), density models (e.g., density-based spatial clustering, ordered point identification), subspace models (e.g., co-clustering, biclustering), and the like. In embodiments, resolving the set of fragility data using the clustering technique may include analyzing the distribution of defects with respect to the set of code components (e.g., as indicated by the widget element identifiers-clear-script elements mapping), and assigning fragility data (e.g., fragility degree extents, fragility nature types) to model the relationship between the set of defects and the set of code components (e.g., indicate which code components are associated with a greater frequency of errors, more severe errors, or the like). Other methods of resolving the set of fragility data using the clustering technique are also possible. The method 700 may end at block 799.
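As one illustration of a centroid model, the following Python sketch runs a small k-means clustering over two-dimensional points. It assumes that each point encodes a (widget element identifier index, clear-script element index) pair produced by the mapping; that encoding, the deterministic initialization, and the sample data are hypothetical choices made so the example is reproducible.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Basic k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(points[i]) for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centroids


# Hypothetical (widget index, clear-script index) pairs for mapped defects.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (11, 11)]
assignments, centroids = kmeans(points, k=2)
cluster_sizes = Counter(assignments)
```

The number of points per cluster (`cluster_sizes`) gives a crude density signal that could feed the fragility data. Production use would more likely rely on an established clustering library and a less naive initialization.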

FIG. 8 is a diagram illustrating a defect-code component distribution 800, according to embodiments. Aspects of FIG. 8 relate to using a clustering technique to resolve a set of fragility data for a set of code components. As described herein, the clustering technique may be used to model the distribution of application program defects (e.g., identified using a set of defect data mined from application-store user reviews) with respect to code components. Based on the distribution of defects with respect to code components, a set of fragility data may be determined for the set of code components (e.g., code components with a greater density of defects may be identified as “fragile”). For instance, in embodiments, a set of widget element identifiers may be graphed on a horizontal axis 840 of the distribution 800, and a set of clear-script elements may be graphed on a vertical axis 830. In this way, points on the distribution 800 may represent test cases 810 (e.g., indicated by the set of clear-script elements) and corresponding defects (e.g., indicated by the set of widget element identifiers). Accordingly, clusters such as cluster 820 may indicate a collection of test cases that have a high correlation with respect to a particular defect. For instance, a first cluster may indicate defects associated with a “log-in interface,” a second cluster may indicate defects associated with a “user profile screen,” and a third cluster may indicate defects associated with a “data submission button.” As described herein, fragility data may be resolved for the set of code components based on the distribution of defects for code components (e.g., clusters with greater density may be assigned greater fragility scores). Other types of clustering techniques are also possible.

FIG. 9 is a flowchart illustrating a method 900 for debugging a set of code components of an application program. Aspects of FIG. 9 relate to prioritizing debugging of particular subsets of code components based on fragility scores. As described herein, aspects of the disclosure relate to the recognition that, in some situations, it may be desirable to prioritize debugging of code components that are considered to be more fragile (e.g., error prone). Accordingly, aspects of the disclosure relate to computing fragility scores for respective code components, and prioritizing debugging of one or more code components that are associated with higher fragility scores. In embodiments, aspects of method 900 may substantially correspond to embodiments described herein and illustrated in FIGS. 1-9. The method may begin at block 901. At block 910, a set of defect data may be collected. At block 930, a set of test case data may be collected. At block 950, a set of fragility data may be determined. At block 970, the set of code components may be debugged.

In embodiments, it may be selected to prioritize debugging of a first subset of the set of code components of the application program with respect to a second subset of the set of code components of the application program based on the set of fragility data at block 952. The first subset of the set of code components may be associated with a first subset of the set of fragility data, and the second subset of the set of code components may be associated with a second subset of the set of fragility data. Generally, selecting can include choosing, electing, ascertaining, resolving, or determining to prioritize debugging of the first subset of the set of code components. In embodiments, selecting to prioritize debugging of a subset of the set of code components may include assigning an order number to one or more code components to define a sequence for debugging the set of code components (e.g., a first subset of the code components will be debugged first, a third subset will be debugged second, a second subset will be debugged third). In embodiments, selecting to prioritize debugging of a subset of the set of code components may include assigning additional resources for use in debugging the subset. As examples, memory resources or processing resources may be allocated for use by a debugger with respect to debugging a particular subset of code components. In certain situations, selecting to prioritize debugging may include requesting a number of debuggers (e.g., additional human workers for debugging), a particular debugging methodology, or a number of hours (e.g., work-hours) for debugging the subset of the set of code components. Other methods of prioritizing debugging of a particular subset of the set of code components are also possible.
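The order-number assignment described above can be sketched in a few lines of Python. The subset names and fragility values below are hypothetical sample data, and sorting by descending fragility is one possible ordering rule rather than the only one contemplated.

```python
def assign_debug_order(fragility_by_subset):
    """Assign order numbers (1 = debug first) by descending fragility value."""
    ranked = sorted(
        fragility_by_subset.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return {name: order for order, (name, _) in enumerate(ranked, start=1)}


# Hypothetical fragility values for three subsets of code components.
order = assign_debug_order({"first": 94, "second": 51, "third": 70})
```

With these sample values the first subset is debugged first, the third subset second, and the second subset third, which mirrors the sequence given in the example above.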

In embodiments, a first fragility score may be computed for a first subset of the set of code components and a second fragility score may be computed for a second subset of the set of code components at block 954. The first and second fragility scores may be computed using a set of parameters having values which relate to respective subsets of the set of code components of the application program. Generally, computing can include calculating, estimating, deriving, formulating, or otherwise ascertaining the first fragility score for the first subset of the set of code components and the second fragility score for the second subset of the set of code components. The fragility scores may include quantitative indications of the degree of fragility (e.g., sensitivity, proclivity to malfunction, error frequency) of one or more code components. As an example, the fragility scores may be expressed as integers between 0 and 100, where greater values indicate relatively higher degrees of fragility. In embodiments, computing the fragility scores may include analyzing the relationships between one or more defects and respective code components, and generating the fragility scores based on the frequency of defects with respect to a particular code component, severity of defects with respect to the code component, the priority (e.g., importance, relevance) of the code component with respect to the application program as a whole, or other parameters that define the relationship between the defect and the code component. As an example, a first subset of the set of code components may be associated with a critical error (e.g., such that the application program will not function). Based on the severity of the defect, a first fragility score of 94 may be computed for the first subset of the set of code components. As another example, a second subset of the set of code components may be associated with a display malfunction error (e.g., a user interface element does not display correctly, although proper functionality is maintained). In embodiments, a second fragility score of 51 may be computed for the second subset of the set of code components (e.g., the nature of the defect is less severe than the critical error of the first subset). Other methods of computing the first and second fragility scores are also possible.
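One way to picture a score on the 0 to 100 scale is a weighted combination of normalized inputs, sketched below in Python. The weighted-sum form, the weights, and the sample input values are assumptions made for illustration; the disclosure does not specify a formula, and this sketch is not intended to reproduce the example scores of 94 and 51.

```python
def fragility_score(frequency, severity, priority, weights=(0.3, 0.5, 0.2)):
    """Combine three inputs in [0, 1] into an integer score from 0 to 100.

    frequency: how often defects occur for the component
    severity:  how severe those defects are
    priority:  how important the component is to the application
    The weights are hypothetical and would need tuning in practice.
    """
    for value in (frequency, severity, priority):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie between 0 and 1")
    w_frequency, w_severity, w_priority = weights
    raw = w_frequency * frequency + w_severity * severity + w_priority * priority
    return round(100 * raw / sum(weights))


# Hypothetical inputs: a critical error versus a display malfunction.
critical_score = fragility_score(frequency=0.9, severity=1.0, priority=0.9)
display_score = fragility_score(frequency=0.6, severity=0.4, priority=0.5)
```

Under these assumed inputs the component with the critical error receives the higher score, consistent with the ordering in the examples above.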

In embodiments, it may be ascertained that the first fragility score exceeds the second fragility score at block 956. Ascertaining that the first fragility score exceeds the second fragility score may be performed by comparing the first and second fragility scores. Generally, ascertaining can include discovering, detecting, confirming, verifying, or otherwise determining that the first fragility score exceeds (e.g., is greater than) the second fragility score. In embodiments, ascertaining may include contrasting the magnitude of the first fragility score and the second fragility score, and detecting that the first fragility score is greater than the second fragility score. As an example, in response to comparing a first fragility score of 86 and a second fragility score of 41, it may be determined that the first fragility score exceeds the second fragility score. As described herein, based on the relation between the first fragility score and the second fragility score, debugging of one or more code components may be prioritized. Other methods of ascertaining that the first fragility score exceeds the second fragility score are also possible.

As described herein, computation of the first and second fragility scores may be based on a set of parameters having sets of values which relate to particular subsets of the set of code components. In embodiments, the set of parameters may include a defect correlation factor. The defect correlation factor may include a measure of how closely (e.g., accurately, precisely) the nature of a particular defect corresponds with a code component. For example, a defect of “application crashes on start-up” may be associated with a high defect correlation factor with respect to a code component configured to allocate memory for application initialization, but a relatively low defect correlation factor with respect to a code component configured to display a logoff screen. In embodiments, the set of parameters may include a test-case correlation factor. The test-case correlation factor may include a measure of how closely the nature of a test-case corresponds with a code component or a defect. For instance, a test-case that calls a “data submission protocol” upon execution may be associated with a high test-case correlation factor with respect to a defect pertaining to “Profile Data Submission Failure,” but a low test-case correlation factor with respect to a defect pertaining to “application will not import pictures from camera.”

In embodiments, the set of parameters may include a defect criticality factor. The defect criticality factor may include a measure of how serious, severe, or urgent a particular defect is. For example, a defect of “application freezes device upon initiation” may be associated with a high defect criticality factor, while a defect of “sad face emoticon does not display correctly” may be associated with a low defect criticality factor. In embodiments, the set of parameters may include a cluster density factor. The cluster density factor may include a measure of the total number or frequency of defects associated with a particular code component. For instance, a code component that is associated with 40 defects in a two-hour time period may be considered to have a high cluster density factor, while a code component that has 2 defects over a 1 year period may be considered to have a low cluster density factor. In embodiments, the set of parameters may include a cluster correlation factor. The cluster correlation factor may include a measure of how closely a cluster of defects and code components relate to other defect-code component clusters. For instance, clusters that have similar defect occurrence frequencies, defect severities, or other attributes may be considered to have high cluster correlation factors (e.g., and may be assigned similar fragility scores).
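The cluster density example above (40 defects in two hours versus 2 defects in a year) can be worked through with a short Python sketch. Expressing density as defects per hour, the cut-off of one defect per hour, and the function names are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def cluster_density_factor(defect_count, window_hours):
    """Return the defect rate in defects per hour over the observation window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return defect_count / window_hours


def classify_density(rate, high_threshold=1.0):
    """Label a rate as high or low; the threshold is a hypothetical cut-off."""
    return "high" if rate >= high_threshold else "low"


busy_rate = cluster_density_factor(40, 2)                # 40 defects in two hours
quiet_rate = cluster_density_factor(2, HOURS_PER_YEAR)   # 2 defects in one year
labels = (classify_density(busy_rate), classify_density(quiet_rate))
```

The first component works out to 20 defects per hour and the second to roughly 0.0002 defects per hour, so under the assumed threshold they are labeled high and low respectively.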

In embodiments, the set of parameters may include a code component complexity factor. The code component complexity factor may include a measure of how elaborate, complicated, or intricate a particular code component is. For instance, code components that include a number of lines, variables, functions, loops, if-then statements, or parameters above a threshold value may be considered to have a high code component complexity factor. In embodiments, the set of parameters may include a code interdependency factor. The code interdependency factor may include a measure of how dependent (e.g., reliant, contingent) a particular code component is on external elements. For example, a particular code component that invokes a number of modules or scripts that are external to the code component may be considered to have a high code interdependency factor. As described herein, one or more of the set of parameters may be used to facilitate computation of the set of fragility scores. Other types of parameters are also possible.
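As a rough sketch of how the complexity and interdependency factors might be measured, the following Python example counts structural features with the standard `ast` module. It assumes, for illustration only, that the code component is Python source and that imported modules stand in for external elements; the function names and thresholds are hypothetical.

```python
import ast


def complexity_counts(source):
    """Count non-blank lines, functions, loops, and if-statements in Python source."""
    counts = {"lines": 0, "functions": 0, "loops": 0, "branches": 0}
    counts["lines"] = len([line for line in source.splitlines() if line.strip()])
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["functions"] += 1
        elif isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            counts["loops"] += 1
        elif isinstance(node, ast.If):
            counts["branches"] += 1
    return counts


def external_modules(source):
    """Return the set of module names the source imports (interdependency proxy)."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            modules.add(node.module or ".")
    return modules


def exceeds_any(counts, thresholds):
    """True when any counted feature is above its (hypothetical) threshold."""
    return any(counts.get(name, 0) > limit for name, limit in thresholds.items())


component = (
    "import os\n"
    "from json import loads\n"
    "\n"
    "def f(xs):\n"
    "    for x in xs:\n"
    "        if x:\n"
    "            print(x)\n"
)
counts = complexity_counts(component)
modules = external_modules(component)
```

A component could then be flagged as having a high complexity factor with a call such as `exceeds_any(counts, {"lines": 200, "loops": 10})`, where the limits shown are placeholders.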

In addition to embodiments described above, other embodiments having fewer operational steps, more operational steps, or different operational steps are contemplated. Also, some embodiments may perform some or all of the above operational steps in a different order. The modules are listed and described illustratively according to an embodiment and are not meant to indicate necessity of a particular module or exclusivity of other potential modules (or functions/purposes as applied to a specific module).

In the foregoing, reference is made to various embodiments. It should be understood, however, that this disclosure is not limited to the specifically described embodiments. Instead, any combination of the described features and elements, whether related to different embodiments or not, is contemplated to implement and practice this disclosure. Many modifications and variations may be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. Furthermore, although embodiments of this disclosure may achieve advantages over other possible solutions or over the prior art, whether or not a particular advantage is achieved by a given embodiment is not limiting of this disclosure. Thus, the described aspects, features, embodiments, and advantages are merely illustrative and are not considered elements or limitations of the appended claims except where explicitly recited in a claim(s).

The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.

These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

Embodiments according to this disclosure may be provided to end-users through a cloud-computing infrastructure. Cloud computing generally refers to the provision of scalable computing resources as a service over a network. More formally, cloud computing may be defined as a computing capability that provides an abstraction between the computing resource and its underlying technical architecture (e.g., servers, storage, networks), enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction. Thus, cloud computing allows a user to access virtual computing resources (e.g., storage, data, applications, and even complete virtualized computing systems) in “the cloud,” without regard for the underlying physical systems (or locations of those systems) used to provide the computing resources.

Typically, cloud-computing resources are provided to a user on a pay-per-use basis, where users are charged only for the computing resources actually used (e.g., an amount of storage space used by a user or a number of virtualized systems instantiated by the user). A user can access any of the resources that reside in the cloud at any time, and from anywhere across the Internet. In context of the present disclosure, a user may access applications or related data available in the cloud. For example, the nodes used to create a stream computing application may be virtual machines hosted by a cloud service provider. Doing so allows a user to access this information from any computing system attached to a network connected to the cloud (e.g., the Internet).

Embodiments of the present disclosure may also be delivered as part of a service engagement with a client corporation, nonprofit organization, government entity, internal organizational structure, or the like. These embodiments may include configuring a computer system to perform, and deploying software, hardware, and web services that implement, some or all of the methods described herein. These embodiments may also include analyzing the client's operations, creating recommendations responsive to the analysis, building systems that implement portions of the recommendations, integrating the systems into existing processes and infrastructure, metering use of the systems, allocating expenses to users of the systems, and billing for use of the systems.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

While the foregoing is directed to exemplary embodiments, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. The descriptions of the various embodiments of the present disclosure have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the various embodiments. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. “Set of,” “group of,” “bunch of,” etc. are intended to include one or more. It will be further understood that the terms “includes” and/or “including,” when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. In the previous detailed description of exemplary embodiments of the various embodiments, reference was made to the accompanying drawings (where like numbers represent like elements), which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the various embodiments may be practiced. These embodiments were described in sufficient detail to enable those skilled in the art to practice the embodiments, but other embodiments may be used and logical, mechanical, electrical, and other changes may be made without departing from the scope of the various embodiments. In the previous description, numerous specific details were set forth to provide a thorough understanding of the various embodiments. However, the various embodiments may be practiced without these specific details. In other instances, well-known circuits, structures, and techniques have not been shown in detail in order not to obscure embodiments.

What is claimed is:
 1. A computer-implemented method for debugging a set of code components of an application program, the method comprising:
collecting, with respect to the application program, a set of defect data which indicates a set of defects, wherein the set of defect data is derived from a set of post-compilation users of the application program;
collecting, with respect to the application program, a set of test case data which indicates a set of user interface features of the application program, wherein the set of test case data is derived from a set of development tests of the application program;
determining, using both the set of defect data and the set of test case data, a set of fragility data for the set of code components of the application program;
debugging, based on the set of fragility data for the set of code components of the application program, the set of code components of the application program;
correlating the set of code components with the set of defects, wherein a respective code component correlates with a respective defect;
correlating the set of fragility data with the set of defect data, wherein a respective subset of the set of fragility data correlates with a respective subset of the set of defect data, wherein the respective subset of the set of fragility data corresponds with the respective code component, and wherein the respective subset of the set of defect data corresponds with the respective defect;
correlating the set of defect data with the set of test case data, wherein a respective subset of the set of test case data corresponds with a respective test case, wherein the respective subset of the set of defect data correlates with the respective subset of the set of test case data, and wherein the respective defect correlates with the respective test case;
correlating the set of test case data with the set of fragility data, wherein the respective subset of the set of test case data correlates with the respective subset of the set of fragility data, and wherein the respective test case correlates with the respective defect;
comparing, using a semantic analysis technique, the set of defect data and the set of test case data to determine the set of fragility data for the set of code components of the application program;
identifying, with respect to the set of defect data, a set of widget element identifiers which indicates the set of defects;
identifying, with respect to the set of test case data, a set of clear-script elements which indicates the set of user interface features of the application program;
mapping, to correlate the set of defects and the set of user interface features of the application program, the set of widget element identifiers and the set of clear-script elements to determine the set of fragility data for the set of code components of the application program;
resolving, using a k-means clustering technique based on the mapping, the set of fragility data for the set of code components of the application program, wherein a respective widget element identifier of the set of widget element identifiers maps with a respective clear-script element of the set of clear-script elements;
selecting, based on the set of fragility data for the set of code components of the application program, to prioritize debug of a first subset of the set of code components of the application program with respect to a second subset of the set of code components of the application program, wherein the first subset of the set of code components is associated with a first subset of the set of fragility data, and wherein the second subset of the set of code components is associated with a second subset of the set of fragility data;
debugging, with priority with respect to the second subset of the set of code components of the application program, the first subset of the set of code components of the application program;
computing, using a set of parameters having a first set of values which relates to the first subset of the set of code components of the application program, a first fragility score for the first subset of the set of code components of the application program;
computing, using the set of parameters having a second set of values which relates to the second subset of the set of code components of the application program, a second fragility score for the second subset of the set of code components of the application program;
ascertaining, by comparing the first and second fragility scores, that the first fragility score exceeds the second fragility score;
retrieving, using a synchronization criterion, a set of updated defect data;
determining, using both the set of updated defect data and the set of test case data, a set of updated fragility data for the set of code components of the application program; and
debugging, based on the set of updated fragility data for the set of code components of the application program, the set of code components of the application program,
wherein the set of parameters includes a defect correlation factor, a test-case correlation factor, a defect criticality factor, a cluster density factor, a cluster correlation factor, a code component complexity factor, and a code interdependency factor.
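
By way of illustration only, the fragility-score computation and the prioritization recited in claim 1 may be sketched as follows. The claim names the seven parameters but does not state how they are combined, so the weighted average used here is an assumption, as are the names FragilityParameters, fragility_score, and prioritize, the normalization of each factor to the range 0 to 1, and the example component names and values. This is a minimal sketch under those assumptions and not the claimed implementation.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FragilityParameters:
    """The seven factors named in the claim (hypothetical container).

    Each value is assumed to be normalized to the range [0, 1].
    """
    defect_correlation: float
    test_case_correlation: float
    defect_criticality: float
    cluster_density: float
    cluster_correlation: float
    code_component_complexity: float
    code_interdependency: float


def fragility_score(params, weights=None):
    """Combine the factors into one score.

    Assumption: a weighted average. Equal weights are used when none are given.
    """
    names = [f.name for f in fields(params)]
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    weighted_sum = sum(weights[name] * getattr(params, name) for name in names)
    return weighted_sum / total_weight


def prioritize(components):
    """Return component names ordered from highest to lowest fragility score."""
    return sorted(
        components,
        key=lambda name: fragility_score(components[name]),
        reverse=True,
    )


# Hypothetical example values for two subsets of code components.
components = {
    "settings_screen": FragilityParameters(0.2, 0.3, 0.1, 0.2, 0.4, 0.3, 0.2),
    "login_screen": FragilityParameters(0.8, 0.6, 0.9, 0.7, 0.5, 0.4, 0.6),
}

debug_order = prioritize(components)
first_score = fragility_score(components[debug_order[0]])
second_score = fragility_score(components[debug_order[1]])
```

In this sketch the component listed first in debug_order corresponds to the first subset of code components, whose fragility score exceeds that of the second subset and which is therefore debugged with priority. Recomputing the scores after retrieving updated defect data would amount to calling prioritize again with refreshed parameter values.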